Abstract. Over finite words, languages of dot-depth one are expressively complete
for alternation-free first-order logic. This fragment is also known as the Boolean
closure of existential first-order logic. Here, the atomic formulas comprise order,
successor, minimum, and maximum predicates. Knast (1983) has shown that it
is decidable whether a language has dot-depth one. We extend Knast's result
to infinite words. In particular, we describe the class of languages definable in
alternation-free first-order logic over infinite words, and we give an effective
characterization of this fragment. This characterization has two components. The first
component is identical to Knast's algebraic property for finite words and the
second component is a topological property, namely being a Boolean combination of
Cantor sets.

As an intermediate step we consider finite and infinite words simultaneously. We 
then obtain the results for infinite words as well as for finite words as special cases. 
In particular, we give a new proof of Knast's Theorem on languages of dot-depth 
one over finite words. 

1 Introduction 

The investigation of logical fragments has a long history. One of the first results in our direction
is due to McNaughton and Papert [22]. They showed that a language over finite words is
definable in first-order logic if and only if it is star-free. A few years earlier, Schützenberger
showed that a language is star-free if and only if its syntactic monoid is aperiodic [28]. For a
regular language given by a (nondeterministic) finite automaton one can effectively compute
its syntactic monoid and test for aperiodicity. Combining the result of McNaughton and
Papert and the result of Schützenberger, this gives an algorithm for checking whether a regular
language is first-order definable.
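The aperiodicity test mentioned above can be illustrated by a minimal sketch. It assumes that the finite semigroup is already given as a multiplication table `mul` over the elements `0, ..., n-1`; this encoding, the function name, and the two example tables are our assumptions for illustration. Computing the syntactic monoid from an automaton is not shown.

```python
def is_aperiodic(mul):
    """Return True if the finite semigroup with table mul is aperiodic.

    mul[x][y] is the product of x and y; elements are 0, ..., n-1 (assumed encoding).
    """
    n = len(mul)
    for x in range(n):
        power = x                      # power = x^1
        for _ in range(n - 1):
            power = mul[power][x]      # after the loop: power = x^n
        if mul[power][x] != power:     # aperiodicity requires x^n = x^(n+1)
            return False
    return True


# Hypothetical examples: a two-element semilattice and the group Z/2Z.
U1 = [[0, 1], [1, 1]]   # 0 is neutral, 1 is a zero
Z2 = [[0, 1], [1, 0]]   # addition modulo 2
```

In a semigroup with n elements the powers of x enter their cycle after at most n steps, so comparing x^n with x^(n+1) suffices.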

The very same approach led to similar decision procedures for various other fragments. 
The motivation for such results is to have some (descriptive) complexity measure for regular 
languages: the simpler a logical formula defining a language, the easier this language is. In 
addition, fragments often admit more efficient algorithms for computational problems such
as the satisfiability problem. For example, the satisfiability problem for full first-order logic is
non-elementary [30], whereas the satisfiability problem for first-order logic with only two variables
is in NEXPTIME [15]. Moreover, one can frequently find temporal logic counterparts for
first-order fragments and these temporal logics allow even more efficient algorithms. For example,
there are temporal logics for first-order logic with two variables having a satisfiability problem
in NP [9] [21]. The satisfiability problem for most temporal logics is PSPACE-complete, see
e.g. [?].

When considering a particular logical fragment $\mathcal{F}$, there are several main aspects
which are of interest: First, which languages are definable in $\mathcal{F}$, e.g., in first-order logic
one can define exactly the class of star-free languages. Second, how can one decide whether a
given regular language is definable in $\mathcal{F}$, e.g., a language is first-order definable if and only if its
syntactic monoid is aperiodic. Third, which closure properties does $\mathcal{F}$ have, e.g., the inverse
homomorphic image of a first-order definable language is again first-order definable. Other
important aspects are given by relations to other fragments and the computational complexity
of problems such as the satisfiability problem or the model-checking problem for $\mathcal{F}$. In this
paper, we focus on the first three aspects. Very often, the second aspect is solved by giving
a decidable algebraic characterization of the syntactic monoid. Apart from pure decidability,
this also has the advantage that several closure properties come for free by Eilenberg's Variety
Theorem [12].

The algebraic approach has been very successful for finite words [?] [?] [33] [?]. It has been
generalized in different directions. One such direction is to extend the algebraic setting in
order to be able to characterize more fragments. The syntactic monoid of a language and of its
complement are identical. Hence, if a fragment is not closed under complementation, then only
considering the syntactic monoid is not sufficient. To overcome this obstacle, Pin introduced
ordered monoids and positive varieties [23]. Other fragments, such as stutter-invariant logics,
are not closed under inverse homomorphisms. The solution to this problem was given by
Straubing who suggested using homomorphisms instead of semigroups or monoids. This led
to the notion of $\mathcal{C}$-varieties [34] [5]. More recently Gehrke, Grigorieff, and Pin developed a
general equational theory for regular languages [16].

Another way to generalize the algebraic approach is to consider other models than finite
words such as infinite words [23], finite trees [?] [?], Mazurkiewicz traces [?], or data words [2],
just to name a few. In most cases, considering models other than finite words requires a new
notion of recognition or even new algebraic objects. The characterizations we give in this paper
rely on an extended notion of recognition based on so-called linked pairs. As it turns out, purely
algebraic conditions are not sufficient in this setting, but together with a topological property
they work well.

When considering language classes for first-order fragments over finite words, there are two
similar hierarchies within the class of star-free languages which take center stage. The first
one is the dot-depth hierarchy introduced by Cohen and Brzozowski and the second one
is the Straubing-Thérien hierarchy [31] [?]. There is a tight connection between the two in
terms of so-called wreath products [32] [40]. Both hierarchies are strict [?] and each level
forms a variety [?]. Thomas showed that there is a one-to-one correspondence between
the quantifier alternation hierarchy of first-order logic and the dot-depth hierarchy [38]. This
correspondence holds if one allows $[<, +1, \min, \max]$ as a signature. The same correspondence
between the Straubing-Thérien hierarchy and the quantifier alternation hierarchy holds if we
restrict the signature to $[<]$, cf. [26]. In particular, all decidability results for the dot-depth
hierarchy and the Straubing-Thérien hierarchy yield decidability of the membership problem
for the respective levels of the quantifier alternation hierarchy and vice versa. Unfortunately,
effectively determining the level of a language in the dot-depth hierarchy or the
Straubing-Thérien hierarchy is one of the most challenging open problems in automata theory. Knast has
shown that the first level of the dot-depth hierarchy is decidable [20], and Simon has given a
decidable characterization for the first level of the Straubing-Thérien hierarchy [29]. These two
levels and the first two half levels of each hierarchy are the only decidable cases known so far,
see e.g. [25] for an overview and [17] for level 3/2 of the dot-depth hierarchy. All of the above
decidability results have been generalized to infinite words [?] [10] [18] [?]; the sole exception
is dot-depth one. The extension of Knast's result to infinite words is the main purpose of
this paper. So far, all generalizations for infinite words rely on a combination of algebraic and
topological properties. As we shall see, dot-depth one is no exception.

Table 1: Fragments of first-order logic over infinite words $\Gamma^\omega$

Dot-depth one over finite words corresponds to the Boolean closure of existential first-order
logic with predicates $<$ for order, $+1$ for successor, $\min$ for first position, and $\max$ for last
position. This fragment is denoted by $\mathbb{B}\Sigma_1[<, +1, \min, \max]$. In our setting $\min$ and $\max$ are
unary predicates rather than constants because a predicate $\max$ also makes sense for infinite
words. Note that this does not change the expressive power of the fragment $\mathbb{B}\Sigma_1$ and that
over infinite words the fragments $\mathbb{B}\Sigma_1[<, +1, \min]$ and $\mathbb{B}\Sigma_1[<, +1, \min, \max]$ coincide. From
an algebraic and topological point of view it is more natural to work with finite and infinite
words simultaneously. However, over $\Gamma^\infty = \Gamma^* \cup \Gamma^\omega$ there is one major difference between
$\mathbb{B}\Sigma_1[<, +1, \min]$ without $\max$ and $\mathbb{B}\Sigma_1[<, +1, \min, \max]$ with $\max$: The latter fragment can
distinguish finite from infinite words whereas $\mathbb{B}\Sigma_1[<, +1, \min]$ cannot differentiate between $\Gamma^*$
and $\Gamma^\omega$. In particular, every $\mathbb{B}\Sigma_1[<, +1, \min]$-definable language with an infinite word also
contains finite words, i.e., $\mathbb{B}\Sigma_1[<, +1, \min]$ has the finite model property.

In all variations (with or without max-predicate; infinite words $\Gamma^\omega$ only or finite and infinite
words $\Gamma^\infty$) we obtain the same algebraic characterization $\mathbf{B}_1$ as Knast did for finite words.
In addition, we have a topological condition which is being a finite Boolean combination of
open sets. Here, open means open in the Cantor topology. This topological property is often
denoted by $F_\sigma \cap G_\delta$, see e.g. [39]. As it turns out, there are two slightly different versions of the
Cantor topology on $\Gamma^\infty$. The first one is given by base sets $u\Gamma^\infty$ for $u \in \Gamma^*$. This corresponds
to the fragment $\mathbb{B}\Sigma_1[<, +1, \min]$ without $\max$ over $\Gamma^\infty$. The second version is given by base
sets of the form $u\Gamma^\infty$ and $\{u\}$ for $u \in \Gamma^*$, i.e., finite words are isolated points. This second
version yields a characterization of $\mathbb{B}\Sigma_1[<, +1, \min, \max]$ with $\max$ over $\Gamma^\infty$. In our setting,
it is more convenient to work with some equivalent linked pair condition instead of using the
topology itself.

Related Work 

Various fragments over infinite words have been considered. Existential first-order logic is
denoted by $\Sigma_1$ and its Boolean closure is $\mathbb{B}\Sigma_1$. For two-variable first-order logic we write $\mathrm{FO}^2$.
The second level of the alternation hierarchy is denoted by $\Sigma_2$. It contains all formulas in prenex
normal form with two blocks of quantifiers, starting with a block of existential quantifiers.
The prefix of a word can be defined in both $\mathrm{FO}^2[<]$ and $\Sigma_2[<]$. Hence, $\mathrm{FO}^2[<, +1, \min] =
\mathrm{FO}^2[<, +1]$ and $\Sigma_2[<, +1, \min] = \Sigma_2[<, +1]$. In contrast, $\mathbb{B}\Sigma_1[<, +1]$ is a strict subclass of
$\mathbb{B}\Sigma_1[<, +1, \min]$. The fragment $\Pi_2$ consists of negations of formulas in $\Sigma_2$. Since regular
languages are effectively closed under complementation, decidability for $\Sigma_2$ yields decidability
for $\Pi_2$.

An overview of effective characterizations can be found in Table 1. For the formal definitions
of the algebraic and topological properties we refer to [10] [?] [23]. The first decision procedures
for $\mathrm{FO}^2[<]$ and $\mathrm{FO}^2[<, +1]$ are due to Wilke [?], and the first effective characterization of
$\Pi_2[<]$ was given by Bojańczyk [1]. Among the topologies in Table 1, the Cantor topology is
the coarsest and the strict factor topology is the finest topology. The relation between the
other topologies is depicted in Figure 1.

Figure 1: Topologies for infinite words (diagram relating the Cantor, alphabetic, strict alphabetic, factor, and strict factor topologies).



2 Preliminaries 

2.1 Languages 

Throughout, $\Gamma$ is a finite nonempty alphabet. The set of finite words over $\Gamma$ is denoted by
$\Gamma^*$. The empty word is $1$, and $\Gamma^+ = \Gamma^* \setminus \{1\}$ is the set of finite, nonempty words. The set
of infinite words is $\Gamma^\omega$ and $\Gamma^\infty = \Gamma^* \cup \Gamma^\omega$ is the set of finite and infinite words. A language
is a subset of $\Gamma^\infty$. Let $L \subseteq \Gamma^*$ and $K \subseteq \Gamma^\infty$. We set $LK = \{u\alpha \in \Gamma^\infty \mid u \in L, \alpha \in K\}$,
$L^* = \{u_1 \cdots u_n \mid n \in \mathbb{N}, u_i \in L\}$, and $L^\omega = \{u_1 u_2 \cdots \mid u_i \in L\}$, i.e., $L^*$ is the set of finite
products of words in $L$ and $L^\omega$ is the set of infinite products. We have $1^\omega = 1$. Let $\alpha \in \Gamma^\infty$
and $u \in \Gamma^*$. The word $u$ is a factor of $\alpha$ if $\alpha = vu\beta$ for some $v \in \Gamma^*$ and $\beta \in \Gamma^\infty$. It is a prefix
if we can choose $v = 1$ and it is a suffix if we can choose $\beta = 1$. We write $u \leq \alpha$ if $u$ is a
prefix of $\alpha$. The length of $\alpha$ is $|\alpha|$ and we have $|\alpha| \in \mathbb{N} \cup \{\infty\}$. For $k \in \mathbb{N}$, the $k$-factor alphabet
of $\alpha$ is $\mathrm{alph}_k(\alpha) = \{u \in \Gamma^k \mid \alpha \in \Gamma^* u \Gamma^\infty\}$. If $X \subseteq \mathbb{N}$, then $\alpha(X)$ is the word comprising all
positions of $\alpha$ which are contained in $X$. By extension, $\alpha(x)$ is the $x$-th letter of $\alpha$. Therefore,
$\alpha = \alpha(1) \cdots \alpha(n)$ if $|\alpha| = n \in \mathbb{N}$ and $\alpha = \alpha(1)\alpha(2) \cdots$ if $|\alpha| = \infty$. We say that a position $x$
of $\alpha$ is covered by a factor $u$ of a factorization $\alpha = vu\beta$ if $|v| < x \leq |vu|$. If the position at
which $u$ occurs in $\alpha$ is clear from the context, then we say that $u$ covers $x$. Similarly, a set
of positions is covered by a set of factors if each position is covered by some factor. Here,
factors are understood with implicit positions of occurrence. A monomial is a language of
the form $w_1\Gamma^*w_2 \cdots \Gamma^*w_n$, of the form $w_1\Gamma^*w_2 \cdots \Gamma^*w_n\Gamma^\omega$, or of the form $w_1\Gamma^*w_2 \cdots \Gamma^*w_n\Gamma^\infty$
for $n \geq 1$ and $w_i \in \Gamma^*$. The degree of the monomial is $|w_1 \cdots w_n|$. A language $L \subseteq \Gamma^*$ of
finite words has dot-depth one if it is a finite Boolean combination of monomials of the form
$w_1\Gamma^*w_2 \cdots \Gamma^*w_n$. Similarly, a language $L \subseteq \Gamma^\omega$ has dot-depth one if it is a finite Boolean
combination of monomials $w_1\Gamma^*w_2 \cdots \Gamma^*w_n\Gamma^\omega$.
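To make the notions of $k$-factor alphabet and monomial concrete, here is a small sketch restricted to finite words represented as Python strings. The function names and the flag `open_end` (which selects between the monomial ending in the last factor and the monomial followed by an arbitrary finite or infinite suffix, evaluated on a finite word) are assumptions made for this illustration only.

```python
def alph_k(word, k):
    """All factors of length k of a finite word (its k-factor alphabet)."""
    return {word[i:i + k] for i in range(len(word) - k + 1)}


def in_monomial(word, factors, open_end):
    """Membership of a finite word in the monomial given by factors w1, ..., wn.

    open_end=False: the word must end with wn (monomial without trailing part).
    open_end=True:  anything may follow wn (trailing part allowed).
    """
    first = factors[0]
    if not word.startswith(first):
        return False
    pos, end, middle = len(first), len(word), factors[1:]
    if not open_end:
        if len(factors) == 1:
            return word == first
        last, middle = factors[-1], factors[1:-1]
        end = len(word) - len(last)          # the last factor is pinned to the end
        if end < pos or not word.endswith(last):
            return False
    for w in middle:                          # greedy leftmost matching, no overlaps
        idx = word.find(w, pos, end)
        if idx < 0:
            return False
        pos = idx + len(w)
    return True
```

Greedy leftmost matching is sufficient here because choosing the earliest occurrence of each factor leaves the most room for the remaining ones.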

2.2 First-Order Logic 

We consider first-order logic $\mathrm{FO} = \mathrm{FO}[<, +1, \min, \max]$ interpreted over finite and infinite
words. In the context of logic we think of words as labeled linearly ordered positions. Variables
range over positions of the word. Atomic formulas are $\top$ for true, the unary predicates $\lambda(x) = a$,
$\min(x)$ and $\max(x)$, and the binary predicates $x < y$ and $x = y + 1$ for variables $x, y$ and $a \in \Gamma$.
The formula $\lambda(x) = a$ means that $x$ is labeled with $a$, and the formula $\min(x)$ (resp. $\max(x)$)
expresses that $x$ is the first (resp. last) position of the word. The formula $x < y$ is true if $x$ is
strictly smaller than $y$, and $x = y + 1$ means that $x$ is the successor position of $y$. Formulas
can be composed by Boolean connectives and by the quantifiers $\exists x\colon \varphi$ and $\forall x\colon \varphi$ for $\varphi \in \mathrm{FO}$.
The semantics of the connectives is as usual. A sentence is a formula without free variables.
For a sentence $\varphi$ and for $\alpha \in \Gamma^\infty$ we write $\alpha \models \varphi$ if $\varphi$ interpreted over the word $\alpha$ is true. The
language defined by $\varphi$ is $L(\varphi) = \{\alpha \in \Gamma^\infty \mid \alpha \models \varphi\}$.

Let $\mathcal{C} \subseteq \{<, +1, \min, \max\}$. The fragment $\Sigma_1[\mathcal{C}]$ of first-order logic consists of all formulas
in $\mathrm{FO}$ in prenex normal form with only one block of existential quantifiers which, apart from
label-predicates, use only predicates in $\mathcal{C}$. The fragment $\mathbb{B}\Sigma_1[\mathcal{C}]$ contains all finite Boolean
combinations of formulas in $\Sigma_1[\mathcal{C}]$. Let $L \subseteq \Gamma^\infty$ be a language and $\mathcal{F}$ be a fragment of
first-order logic. Then $L$ is definable in $\mathcal{F}$ if there exists some sentence $\varphi \in \mathcal{F}$ such that $L = L(\varphi)$.
Sometimes we want to restrict the interpretation of the formula to some subset $K \subseteq \Gamma^\infty$. We
say that $L$ is definable in $\mathcal{F}$ over $K$ if there is a sentence $\varphi \in \mathcal{F}$ with $L = \{\alpha \in K \mid \alpha \models \varphi\}$.
We frequently use this with $K = \Gamma^*$ or $K = \Gamma^\omega$. Note that $\max(x)$ is false for all positions $x$ of
an infinite word, i.e., a language $L$ is definable in $\mathbb{B}\Sigma_1[\mathcal{C}]$ over $\Gamma^\omega$ if and only if $L$ is definable
in $\mathbb{B}\Sigma_1[\mathcal{C}, \max]$ over $\Gamma^\omega$.

2.3 Finite Semigroups and Finite Monoids

Let $S$ be a semigroup. An element $x \in S$ is idempotent if $x^2 = x$. If $S$ is finite, then there
exists a number $n \geq 1$ such that the element $x^n$ is idempotent for all $x \in S$. The monoid
$S^1$ generated by $S$ is defined as follows. If $S$ is a monoid, then we set $S^1 = S$; otherwise
$S^1 = S \cup \{1\}$ is the monoid obtained by adding a new neutral element $1$. Green's relations $\mathcal{R}$
and $\mathcal{L}$ are an important means for structural analysis in the theory of finite semigroups. For
$x, y \in S$ we set

$$x \mathrel{\mathcal{R}} y \text{ iff } xS^1 = yS^1, \qquad x \leq_{\mathcal{R}} y \text{ iff } xS^1 \subseteq yS^1,$$
$$x \mathrel{\mathcal{L}} y \text{ iff } S^1x = S^1y, \qquad x \leq_{\mathcal{L}} y \text{ iff } S^1x \subseteq S^1y.$$

Remember that $xS^1 = \{xz \mid z \in S^1\}$ and $S^1x = \{zx \mid z \in S^1\}$. We often use these relations
in the following way: The relation $x \leq_{\mathcal{R}} y$ holds if and only if there exists $z \in S^1$ such that
$x = yz$. Likewise, $x \leq_{\mathcal{L}} y$ if and only if there exists $z \in S^1$ such that $x = zy$. As usual, we
write $x <_{\mathcal{R}} y$ if $x \leq_{\mathcal{R}} y$ but not $x \mathrel{\mathcal{R}} y$. The relation $<_{\mathcal{L}}$ is defined similarly.
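Green's relations can be computed directly from their definition via principal one-sided ideals. The following sketch assumes a finite semigroup encoded as a multiplication table `mul` over `0, ..., n-1`; the encoding and all function names are our own choices for illustration.

```python
def right_ideal(mul, x):
    """The set x S^1: x itself together with all right multiples x*z."""
    return {x} | {mul[x][z] for z in range(len(mul))}


def left_ideal(mul, x):
    """The set S^1 x: x itself together with all left multiples z*x."""
    return {x} | {mul[z][x] for z in range(len(mul))}


def r_leq(mul, x, y):
    return right_ideal(mul, x) <= right_ideal(mul, y)


def r_equiv(mul, x, y):
    return right_ideal(mul, x) == right_ideal(mul, y)


def l_leq(mul, x, y):
    return left_ideal(mul, x) <= left_ideal(mul, y)


def l_equiv(mul, x, y):
    return left_ideal(mul, x) == left_ideal(mul, y)
```

For instance, in a two-element left-zero semigroup (where xy = x) the two elements are related by the second of Green's relations above but not by the first.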

A finite semigroup $S$ is in $\mathbf{B}_1$ if for all idempotents $e, f \in S$ and for all $s, t, x, y \in S$ we have

$$(exfy)^n exf (tesf)^n = (exfy)^n esf (tesf)^n$$

for $n \geq 1$ such that all $n$-th powers are idempotent in $S$. A semigroup $S$ is aperiodic if for every
$x \in S$ there exists $n \geq 1$ such that $x^n = x^{n+1}$. In the equation for $\mathbf{B}_1$ we can set $e$, $f$, $s$, $t$ and $y$
to $x^n$ which yields $x^n x = x^n$. Hence, every semigroup in $\mathbf{B}_1$ is aperiodic. Another important
property of $\mathbf{B}_1$ is given in Lemma 3 below.
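Since the defining equation quantifies over finitely many elements, membership of a finite semigroup in this class can be tested by brute force. The sketch below assumes a multiplication table `mul` over `0, ..., n-1` and replaces the common exponent by the idempotent power of each individual element, which gives the same value. All names are hypothetical, and the running time is only acceptable for small tables.

```python
from itertools import product


def omega_power(mul, x):
    """The unique idempotent among the powers x, x^2, x^3, ..."""
    p = x
    while mul[p][p] != p:
        p = mul[p][x]
    return p


def prod(mul, *xs):
    """Product of the given elements, from left to right."""
    r = xs[0]
    for x in xs[1:]:
        r = mul[r][x]
    return r


def in_B1(mul):
    """Brute-force test of Knast's equation for the semigroup table mul."""
    elems = range(len(mul))
    idem = [e for e in elems if mul[e][e] == e]
    for e, f in product(idem, repeat=2):
        for s, t, x, y in product(elems, repeat=4):
            left = omega_power(mul, prod(mul, e, x, f, y))    # (exfy)^n
            right = omega_power(mul, prod(mul, t, e, s, f))   # (tesf)^n
            if prod(mul, left, e, x, f, right) != prod(mul, left, e, s, f, right):
                return False
    return True
```

As expected from the remark on aperiodicity, the two-element group fails the test, whereas a two-element semilattice and a left-zero semigroup pass.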

The theory of first-order fragments over finite nonempty words is more concise with
semigroups rather than with monoids. However, we want to treat finite and infinite words
simultaneously, and our approach is heavily based on allowing the empty word $1$ (and the fact that
$1^\omega = 1$). On the other hand, it is crucial that the idempotents $e$ and $f$ in the above equation
for $\mathbf{B}_1$ correspond to nonempty words. We therefore consider homomorphisms $h : \Gamma^* \to M$ to
finite monoids. Membership in $\mathbf{B}_1$ is then formulated as $h(\Gamma^+) \in \mathbf{B}_1$.

2.4 Recognizability 

A language $L \subseteq \Gamma^\infty$ is regular if it is recognized by an extended Büchi automaton [7], i.e., a
finite automaton with two sorts of final states; the first sort is for accepting finite words and
the second is for accepting infinite words by a Büchi condition. Alternatively, a language is
regular if and only if it is definable in monadic second-order logic. We use a more algebraic
framework for recognition based on finite monoids.

Let $h : \Gamma^* \to M$ be a homomorphism to a finite monoid $M$. If $h$ is understood and $s \in M$,
then we write $[s]$ for the language $h^{-1}(s)$. A linked pair of $M$ is a pair $(s, e) \in M \times M$ such
that $e$ is idempotent and $s = se$. For every word $\alpha \in \Gamma^\infty$ there exists a linked pair $(s, e)$ of $M$
such that $\alpha \in [s][e]^\omega$ by Ramsey's Theorem [27]. A language $L \subseteq \Gamma^\infty$ is recognized by $h$ if

$$L = \bigcup \{[s][e]^\omega \mid (s, e) \text{ is a linked pair with } [s][e]^\omega \cap L \neq \emptyset\}.$$
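The linked pairs of a finite monoid are easy to enumerate. The following sketch assumes the monoid is given by a multiplication table `mul` over `0, ..., n-1`; the encoding and the function name are assumptions for illustration.

```python
def linked_pairs(mul):
    """All pairs (s, e) such that e is idempotent and s = s*e."""
    n = len(mul)
    return [(s, e)
            for s in range(n)
            for e in range(n)
            if mul[e][e] == e and mul[s][e] == s]


# Hypothetical example: the two-element monoid with neutral element 0 and zero 1.
U1 = [[0, 1], [1, 1]]
```

For `U1` this yields the three pairs `(0, 0)`, `(1, 0)` and `(1, 1)`; the pair `(0, 1)` is excluded because multiplying the neutral element by the zero does not return the neutral element.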

The syntactic congruence of $L \subseteq \Gamma^\infty$ is defined as follows. For nonempty words $p, q \in \Gamma^+$ we
let $p \equiv_L q$ if for all words $u, v, w \in \Gamma^*$ the following equivalences hold:

$$upvw^\omega \in L \Leftrightarrow uqvw^\omega \in L \quad\text{and}\quad u(pv)^\omega \in L \Leftrightarrow u(qv)^\omega \in L.$$

Remember that $1^\omega = 1$. This relation indeed is a congruence and the congruence classes
$[p]_L = \{q \in \Gamma^+ \mid p \equiv_L q\}$ constitute the syntactic semigroup $\mathrm{Synt}(L)$. The syntactic monoid
$\mathrm{Synt}^1(L)$ is the monoid generated by $\mathrm{Synt}(L)$, i.e., $\mathrm{Synt}^1(L) = S^1$ for $S = \mathrm{Synt}(L)$. The
syntactic homomorphism $h_L : \Gamma^* \to \mathrm{Synt}^1(L)$ is defined by $h_L(a) = [a]_L$ for $a \in \Gamma$. A
variant of the syntactic monoid is the pure syntactic monoid $\mathrm{Synt}_+(L) = \mathrm{Synt}(L) \cup \{1\}$, i.e.,
we add a new neutral element to $\mathrm{Synt}(L)$, even if $\mathrm{Synt}(L)$ is a monoid. The pure syntactic
homomorphism $h_+ : \Gamma^* \to \mathrm{Synt}_+(L)$ is defined by $h_+(p) = h_L(p)$ for $p \neq 1$. The only possible
difference between $h_+$ and $h_L$ is their behavior on the empty word. Note that

$$h_L(\Gamma^+) = h_+(\Gamma^+) = \mathrm{Synt}(L) \subseteq \mathrm{Synt}^1(L) \subseteq \mathrm{Synt}_+(L)$$

and $\mathrm{Synt}^1(L) \setminus \{1\} = \mathrm{Synt}(L) \subseteq \mathrm{Synt}_+(L)$. A language $L \subseteq \Gamma^\infty$ is regular if and only if both
$\mathrm{Synt}(L)$ is finite and $h_L$ recognizes $L$, see e.g. [23] [39]. Moreover, $L$ is recognized by its syntactic
homomorphism $h_L$ if and only if it is recognized by its pure syntactic homomorphism $h_+$. In
contrast to $h_L$, the pure syntactic homomorphism has the property that $h_+(u) = 1$ if and only
if $u = 1$.

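Lemma 1. Let $L \subseteq \Gamma^\infty$ be recognized by a homomorphism $h : \Gamma^* \to M$ such that $h(u) = 1$
only if $u = 1$. Then both $L \cap \Gamma^*$ and $L \cap \Gamma^\omega$ are also recognized by $h$.

Proof: We have $[s] = [s][1]^\omega \subseteq \Gamma^*$. Moreover, $[s][e]^\omega \subseteq \Gamma^\omega$ if $e \neq 1$. This proves the claim. □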

3 Algebraic Properties 

This section contains simple algebraic and combinatorial properties of the class $\mathbf{B}_1$. The
following elementary lemma gives a mechanism for obtaining idempotent stabilizers with a
nonempty preimage: Every sufficiently long word $u$ has a short prefix $p$ admitting a nonempty
idempotent stabilizer $e$.

Lemma 2. Let $h : \Gamma^* \to M$ be a homomorphism to a finite monoid $M$ and let $u \in \Gamma^*$ with
$|u| = |M| - 1$. Then there exists a prefix $p$ of $u$ and an idempotent $e \in h(\Gamma^+)$ with $h(p)e = h(p)$.

Proof: Let $a \in \Gamma$ and let $1 = p_0 < p_1 < \cdots < p_{|M|} = ua$ be the prefixes of $ua$. By the
pigeonhole principle, there exist $0 \leq i < j \leq |M|$ such that $h(p_i) = h(p_j)$. In particular, we
have $i \leq |M| - 1$ and $p_i$ is a prefix of $u$. Let $p_i q = p_j$ for $q \in \Gamma^+$. We set $e$ to be the
idempotent element generated by $h(q)$. Now, $h(p)e = h(p)$ for $p = p_i$. □
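The pigeonhole argument in the proof of Lemma 2 is constructive. The following sketch follows it step by step, assuming the monoid is given by a multiplication table `mul` with neutral element `one`, and the homomorphism by a dictionary `h` from letters to elements. These encodings and the function name are our assumptions, not notation from the text.

```python
def stabilized_prefix(mul, one, h, u, a):
    """Return (p, e) with p a prefix of u, e idempotent, and h(p)*e == h(p).

    u should have length at least |M| - 1; a is an arbitrary extra letter.
    """
    word = u + a
    images = [one]                        # images[i] = h(prefix of length i)
    for letter in word:
        images.append(mul[images[-1]][h[letter]])
    first_seen = {}
    for j, m in enumerate(images):
        if m in first_seen:               # h(p_i) == h(p_j) with i < j
            i = first_seen[m]
            hq = one
            for letter in word[i:j]:      # q is the nonempty word between p_i and p_j
                hq = mul[hq][h[letter]]
            e = hq
            while mul[e][e] != e:         # idempotent power generated by h(q)
                e = mul[e][hq]
            return word[:i], e
        first_seen[m] = j
    raise ValueError("u is too short: need len(u) >= |M| - 1")
```

Since $h(p_i)h(q) = h(p_i)$, every power of $h(q)$ stabilizes $h(p_i)$, and in particular so does its idempotent power.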

Next we state the key property of $\mathbf{B}_1$, a substitution rule valid in certain situations. Much
of the work in proving our main theorem is devoted to guaranteeing its premises.

Lemma 3. Let $S \in \mathbf{B}_1$. If $u \mathrel{\mathcal{R}} uexf$ and $esfv \mathrel{\mathcal{L}} v$ for idempotents $e, f \in S$ and for
$u, v, x, s \in S$, then $uexfv = uesfv$.

Proof: Choose $n \geq 1$ such that all $n$-th powers in $S$ are idempotent. Since $u \mathrel{\mathcal{R}} uexf$ and
$v \mathrel{\mathcal{L}} esfv$, there exist $y, t \in S^1$ with $u = uexfy$ and $v = tesfv$. In particular, $u = u(exfy)^n$
and $v = (tesf)^n v$. We can assume $y, t \in S$ because $e$ and $f$ are idempotent. Using the equation
for $\mathbf{B}_1$ we conclude

$$uexfv = u(exfy)^n exf (tesf)^n v = u(exfy)^n esf (tesf)^n v = uesfv. \qquad \Box$$

Proposition 4 below gives an important combinatorial feature of $\mathbf{B}_1$. It shows that if the
$\mathcal{R}$-class changes when reading a word from left to right (resp. the $\mathcal{L}$-class changes when reading
the word from right to left), then this happens with a new factor of bounded length.

Proposition 4. Let $h : \Gamma^* \to M$ be a homomorphism with $h(\Gamma^+) \in \mathbf{B}_1$ and let $k \geq |M|$. For
all $a \in \Gamma$ and $u, x \in \Gamma^*$ with $|x| \geq k$ we have:

1. $h(u) \mathrel{\mathcal{R}} h(ux) >_{\mathcal{R}} h(uxa) \;\Rightarrow\; \mathrm{alph}_k(x) \neq \mathrm{alph}_k(xa)$.

2. $h(u) \mathrel{\mathcal{L}} h(xu) >_{\mathcal{L}} h(axu) \;\Rightarrow\; \mathrm{alph}_k(x) \neq \mathrm{alph}_k(ax)$.

Proof: By left-right symmetry, it suffices to show '1'. Assume $h(u) \mathrel{\mathcal{R}} h(ux) >_{\mathcal{R}} h(uxa)$ and
$\mathrm{alph}_k(x) = \mathrm{alph}_k(xa)$. Let $w$ be the suffix of length $k$ of $xa$. By Lemma 2 there exist $y, z \in \Gamma^*$
with $w = yza$ and $h(y)e = h(y)$ for some idempotent $e \in h(\Gamma^+)$ because $|w| \geq |M|$. Let $|y|$ be
maximal with this property. Since $w \in \mathrm{alph}_k(xa) = \mathrm{alph}_k(x)$, we can write $x = syzatz$ for some
$s, t \in \Gamma^*$ such that $y$ is a suffix of $yzat$. Note that there is indeed at least one letter between
the two occurrences of $z$. Let $u' = h(usy)$ and $x' = h(zat)$. We have $u' = u'e$, $u'x' = u'x'e$,
and there exists $y' \in h(\Gamma^+)$ with $u' = u'x'y'$. Therefore, we have $u'x' = u'(ex'ey')^n ex'e (eeee)^n$
for all $n \in \mathbb{N}$, and by $h(\Gamma^+) \in \mathbf{B}_1$ this equals $u'(ex'ey')^n eee (eeee)^n = u'$ for sufficiently large $n$.
Thus $u' = u'x'$ and $h(u) \mathrel{\mathcal{R}} u' = u'x'x' \mathrel{\mathcal{R}} h(uxa)$, contradicting the assumption. □

4 The Fragment $\mathbb{B}\Sigma_1[<, +1, \min]$ over $\Gamma^\infty$

This section contains our main result, Theorem 5. We give an effective characterization of the
first-order fragment $\mathbb{B}\Sigma_1[<, +1, \min]$ over finite and infinite words.

Over $\Gamma^\infty$, the fragment $\mathbb{B}\Sigma_1[<, +1, \min]$ yields a strict subclass of the
$\mathbb{B}\Sigma_1[<, +1, \min, \max]$-definable languages. For example, $\Gamma^* a$ is not definable in alternation-free first-order logic
without max-predicate. On the other hand, the language $a\Gamma^\infty$ is definable in $\mathbb{B}\Sigma_1[\min]$. We
pinpoint this asymmetry of $\mathbb{B}\Sigma_1[<, +1, \min]$ to some topological condition (expressed in terms
of linked pairs).

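Theorem 5. Let $L \subseteq \Gamma^\infty$ be regular. The following assertions are equivalent:

1. $L$ is a finite Boolean combination of monomials of the form $w_1\Gamma^*w_2 \cdots \Gamma^*w_n\Gamma^\infty$.

2. $L$ is definable in $\mathbb{B}\Sigma_1[<, +1, \min]$.

3. The syntactic homomorphism $h_L : \Gamma^* \to \mathrm{Synt}^1(L)$ satisfies

a) $\mathrm{Synt}(L) \in \mathbf{B}_1$, and

b) for all linked pairs $(s, e)$ and $(t, f)$ of $\mathrm{Synt}^1(L)$ with $s \mathrel{\mathcal{R}} t$ we have $[s][e]^\omega \subseteq L \Leftrightarrow [t][f]^\omega \subseteq L$.

4. $L$ is recognized by a homomorphism $h : \Gamma^* \to M$ satisfying

a) $h(\Gamma^+) \in \mathbf{B}_1$, and

b) for all linked pairs $(s, e)$ and $(t, f)$ of $M$ with $s \mathrel{\mathcal{R}} t$ we have $[s][e]^\omega \subseteq L \Leftrightarrow [t][f]^\omega \subseteq L$.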
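Condition (b) of the theorem is a finite check once the monoid and the accepted linked pairs are known. The sketch below assumes a multiplication table `mul` over `0, ..., n-1` and a set `accepted` containing exactly those linked pairs $(s, e)$ with $[s][e]^\omega \subseteq L$. Both the encoding and the names are assumptions for illustration.

```python
def satisfies_condition_b(mul, accepted):
    """Check that linked pairs with R-equivalent first components agree on acceptance."""
    n = len(mul)

    def right_ideal(x):
        return frozenset({x} | {mul[x][z] for z in range(n)})

    pairs = [(s, e) for s in range(n) for e in range(n)
             if mul[e][e] == e and mul[s][e] == s]
    for (s, e) in pairs:
        for (t, f) in pairs:
            same_class = right_ideal(s) == right_ideal(t)
            if same_class and ((s, e) in accepted) != ((t, f) in accepted):
                return False
    return True


# Hypothetical example over the alphabet {a, b}: element 1 means "some a occurs".
U1 = [[0, 1], [1, 1]]
```

With `U1`, the language of words containing at least one `a` corresponds to `accepted = {(1, 0), (1, 1)}` and passes the check. The language of words with infinitely many occurrences of `a` corresponds to `accepted = {(1, 1)}` and fails, since the pairs `(1, 0)` and `(1, 1)` share their first component.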

Remark 6. Suppose $h : \Gamma^* \to M$ recognizes a regular language $L$ and consider the condition
$[s][e]^\omega \subseteq L \Leftrightarrow [t][f]^\omega \subseteq L$ for all linked pairs $(s, e)$ and $(t, f)$ of $M$ with $s \mathrel{\mathcal{R}} t$. This condition is
equivalent to $L$ being a finite Boolean combination of open sets, cf. [?, Theorem VI.3.7]. Here,
open means open in the Cantor topology defined by the base sets $u\Gamma^\infty$ for $u \in \Gamma^*$. Therefore,
the conditions '3b' and '4b' in Theorem 5 are actually topological properties.

Remark 7. For languages over $\Gamma^\infty$ there is also the concept of weak recognition. A language
$L$ is weakly recognized by a homomorphism $h : \Gamma^* \to M$ to a finite monoid if

$$L = \bigcup \{[s][e]^\omega \mid (s, e) \text{ is a linked pair with } [s][e]^\omega \subseteq L\}.$$

If a language $L \subseteq \Gamma^\infty$ is recognized by a homomorphism $h$, then it is weakly recognized by $h$. In
general the converse is not true. However, if in addition $[s][e]^\omega \subseteq L \Leftrightarrow [t][f]^\omega \subseteq L$ for all linked
pairs $(s, e)$ and $(t, f)$ of $M$ with $s \mathrel{\mathcal{R}} t$, then weak recognition implies strong recognition. Suppose
$[s][e]^\omega \cap L \neq \emptyset$. Then there exists a linked pair $(t, f)$ with $[t][f]^\omega \subseteq L$ and $[s][e]^\omega \cap [t][f]^\omega \neq \emptyset$.
The latter condition implies $s \mathrel{\mathcal{R}} t$ and hence $[s][e]^\omega \subseteq L$.

In the remainder of this section we prove Theorem 5. The implications '1 ⇒ 2' and '2 ⇒ 3'
are Lemmas 8, 9 and 10. The most involved part '4 ⇒ 1' is shown in the second half of this
section.

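Lemma 8. Let $n \geq 1$ and let $w_1, \ldots, w_n \in \Gamma^*$.

1. The monomial $w_1\Gamma^*w_2 \cdots \Gamma^*w_n\Gamma^\infty$ is defined by a sentence in $\Sigma_1[<, +1, \min]$ with
quantifier depth $|w_1 \cdots w_n|$.

2. The monomial $w_1\Gamma^*w_2 \cdots \Gamma^*w_n$ is defined by a sentence in $\Sigma_1[<, +1, \min, \max]$ with
quantifier depth $|w_1 \cdots w_n|$.

Proof: We write $\equiv$ for syntactic equivalence of formulas. For variable vectors $\bar{x} = (x_1, \ldots, x_\ell)$
and $\bar{y} = (y_1, \ldots, y_m)$ we introduce the shortcuts $\exists \bar{x} \equiv \exists x_1 \cdots \exists x_\ell$, $\min(\bar{x}) \equiv \min(x_1)$,
$\max(\bar{x}) \equiv \max(x_\ell)$, $\bar{x} < \bar{y} \equiv x_\ell < y_1$, and $\lambda(\bar{x}) = a_1 \cdots a_\ell$ for

$$\bigwedge_{1 \leq j \leq \ell} \lambda(x_j) = a_j \;\wedge \bigwedge_{1 \leq j < \ell} x_{j+1} = x_j + 1.$$

Let $L = w_1\Gamma^*w_2 \cdots \Gamma^*w_n\Gamma^\infty$. We introduce variable vectors $\bar{x}_i = (x_{i,1}, \ldots, x_{i,|w_i|})$ for every
$i \in \{1, \ldots, n\}$. Then $L$ is defined by the following sentence $\varphi$:

$$\exists \bar{x}_1 \cdots \exists \bar{x}_n \colon \min(\bar{x}_1) \;\wedge \bigwedge_{1 \leq i \leq n} \lambda(\bar{x}_i) = w_i \;\wedge \bigwedge_{1 \leq i < n} \bar{x}_i < \bar{x}_{i+1}.$$

The second term of the conjunction ensures that each $\bar{x}_i$ corresponds to a factor $w_i$ and the
first term says that any model starts with $w_1$. The third term makes sure that the factors $w_i$
occur in the correct order. The sentence for $w_1\Gamma^*w_2 \cdots \Gamma^*w_n$ is $\varphi \wedge \max(\bar{x}_n)$. □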

Lemma 9. If $L \subseteq \Gamma^\infty$ is definable in $\mathbb{B}\Sigma_1[<, +1, \min, \max]$, then $\mathrm{Synt}(L) \in \mathbf{B}_1$.

Proof: Let $e, f, s, t, x, y \in \Gamma^+$, let $n \geq 1$ and define

$$p = (e^n x f^n y)^n e^n x f^n (t e^n s f^n)^n,$$
$$q = (e^n x f^n y)^n e^n s f^n (t e^n s f^n)^n.$$

For all $u, v, w \in \Gamma^*$ and for all sentences $\varphi \in \Sigma_1[<, +1, \min, \max]$ with quantifier depth at
most $n$, we show $upvw^\omega \models \varphi$ if and only if $uqvw^\omega \models \varphi$. Let $\psi$ be quantifier free such that
$\varphi \equiv \exists x_1 \cdots \exists x_n \colon \psi$. Suppose $upvw^\omega \models \varphi$ and consider positions $x_i$ such that $\psi$ is true. The
consecutive positions in this assignment induce a sequence of factors $w_1, \ldots, w_m$ of $upvw^\omega$ with
$m \leq n$ and $|w_i| \leq n$ for all $i$. Since this sequence of nonadjacent factors appears in the same
order in $uqvw^\omega$, we see that $uqvw^\omega \models \varphi$. Showing that $uqvw^\omega \models \varphi$ implies $upvw^\omega \models \varphi$ is
symmetric.

The equivalence of $u(pv)^\omega \models \varphi$ and $u(qv)^\omega \models \varphi$ is similar. Thus the syntactic semigroup of
every $\mathbb{B}\Sigma_1[<, +1, \min, \max]$-definable language is in $\mathbf{B}_1$. □

Lemma 10. Let $L \subseteq \Gamma^\infty$ be definable in $\mathbb{B}\Sigma_1[<, +1, \min]$ and let $M$ be a finite monoid. For
every surjective homomorphism $h : \Gamma^* \to M$ which recognizes $L$ we have $[s] \subseteq L \Leftrightarrow [s][e]^\omega \subseteq L$
for every linked pair $(s, e)$ of $M$.

Proof: Let $\varphi \in \Sigma_1[<, +1, \min]$ be a sentence. If $\alpha \in \Gamma^\infty$ models $\varphi$, then there is a prefix $u$ of
$\alpha$ such that for every $\beta \in \Gamma^\infty$ we have $u\beta \models \varphi$. This is because $\alpha \models \varphi$ yields some satisfying
assignment for the variables, and positions beyond the last position of this assignment have no
influence. Let $L$ be defined by a sentence in $\mathbb{B}\Sigma_1[<, +1, \min]$ with quantifier depth $d$. Consider
$\alpha = \hat{s}\hat{e}^\omega$ for words $\hat{s} \in [s]$ and $\hat{e} \in [e]$. By the above consideration there exists a finite prefix
$u$ of $\alpha$ of the form $\hat{s}\hat{e}^m$ for some $m$ such that $\alpha$ and $u$ model the same formulas in $\Sigma_1[<, +1, \min]$
with quantifier depth at most $d$. Now, $u \in L$ if and only if $\alpha \in L$. Therefore, $[s] \subseteq L$ if and only
if $[s][e]^\omega \subseteq L$. □

In the remainder of this section we show that condition '4' in Theorem 5 is sufficient. To
this end we show that if $\alpha$ and $\beta$ are contained in the same monomials up to a certain degree,
then their images in a semigroup in $\mathbf{B}_1$ are $\mathcal{R}$-related. The main idea is to apply Lemma 3.
The first step is to show that under certain conditions we can replace several factors in finite
words (Lemma 11). To formulate these conditions we introduce the $\mathcal{R}(k)$-factorization and the
$\mathcal{L}(k)$-factorization. Then the substitution principle in Lemma 11 is extended to infinite words
(Lemma 12). Finally, in Proposition 13 we show that in $\mathbf{B}_1$ we can guarantee the premises of
Lemma 12.

We think of a factor $u_i$ as being equipped with the position $x_i$ of its first letter. Consequently,
a factorization $F$ is a tuple $(x_1, u_1, \ldots, x_\ell, u_\ell) \in (\mathbb{N} \times \Gamma^+)^\ell$ with $\ell \geq 0$ and $x_{i+1} \geq x_i + |u_i|$ for
all $1 \leq i < \ell$, i.e., we assume that the factors $u_i$ are in increasing order and nonoverlapping.
The type of $F$ is the sequence of words $(u_1, \ldots, u_\ell)$. We say that $F$ is a factorization of $\alpha$ if
$u_i = \alpha(\{x_i, \ldots, x_i + |u_i| - 1\})$ for all $1 \leq i \leq \ell$.

We want to merge two factorizations $F = (x_1, u_1, \ldots, x_\ell, u_\ell)$ and $G = (y_1, v_1, \ldots, y_m, v_m)$
of $\alpha$. In order to define the join $F \vee G$ of $F$ and $G$, we combine overlapping factors of $F$ and $G$
into one factor, see Figure 2 for an illustration. More precisely, let $X_i = \{x_i, \ldots, x_i + |u_i| - 1\}$
be the positions of the factor $u_i$ and let $Y_i = \{y_i, \ldots, y_i + |v_i| - 1\}$ be the positions of the
factor $v_i$. We say that $X = \bigcup_{i=1}^{\ell} X_i$ is the set of positions of $F$. Analogously, $Y = \bigcup_{i=1}^{m} Y_i$
is the set of positions of $G$. We set $Z = X \cup Y$. Let $\{Z_1, \ldots, Z_n\}$ be the finest partition
of $Z$ such that every class $Z_j$ is a union of sets $X_i$ and sets $Y_i$. Therefore, if $x < y < z$ and
$x, z \in Z_j$, then $y \in Z_j$; otherwise we could split $Z_j$ into two classes $Z_j \cap \{s \in \mathbb{N} \mid s < y\}$ and
$Z_j \cap \{s \in \mathbb{N} \mid s > y\}$, resulting in a finer partition. Therefore, each $\alpha(Z_j)$ is a factor of $\alpha$. Let
$z_j$ be the minimal element in $Z_j$ and suppose $z_1 < \cdots < z_n$. Now, the join of $F$ and $G$ is

$$F \vee G = (z_1, \alpha(Z_1), \ldots, z_n, \alpha(Z_n)).$$

It is easy to see that the operation $\vee$ on factorizations of $\alpha$ is associative and commutative.
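The join amounts to merging overlapping intervals of positions. The following sketch assumes that a factorization is represented as a list of pairs `(position, factor)` with 1-based positions and that the underlying word `alpha` is a finite Python string; this representation and the function name are our own assumptions. Factors that merely touch are kept separate, as in the definition.

```python
def join(alpha, F, G):
    """Join of two factorizations F and G of the finite word alpha."""
    # Closed intervals [start, end] of 1-based positions covered by each factor.
    intervals = sorted((x, x + len(u) - 1) for x, u in F + G)
    merged = []
    for start, end in intervals:
        if merged and start <= merged[-1][1]:      # shares a position: same class
            merged[-1][1] = max(merged[-1][1], end)
        else:                                      # disjoint (possibly adjacent): new class
            merged.append([start, end])
    return [(s, alpha[s - 1:e]) for s, e in merged]
```

Encoding fourteen distinct letters as `"abcdefghijklmn"`, the factorizations `[(1, "abc"), (8, "hij"), (12, "lm")]` and `[(1, "ab"), (6, "f"), (7, "gh"), (10, "jk")]` have the join `[(1, "abc"), (6, "f"), (7, "ghijk"), (12, "lm")]`.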

Figure 2: The join $F \vee G$ of the factorizations $F$ and $G$ obtained by merging overlapping
factors. Here, we have $F = (1, a_1a_2a_3, 8, a_8a_9a_{10}, 12, a_{12}a_{13})$ and
$G = (1, a_1a_2, 6, a_6, 7, a_7a_8, 10, a_{10}a_{11})$. The join of these two factorizations is
$F \vee G = (1, a_1a_2a_3, 6, a_6, 7, a_7a_8a_9a_{10}a_{11}, 12, a_{12}a_{13})$. Note that nonoverlapping adjacent
factors are not merged.

An important algebraic concept in our proofs is the $\mathcal{R}(k)$-factorization and its left-right
dual, the $\mathcal{L}(k)$-factorization. Let $h : \Gamma^* \to M$ be a homomorphism to a finite monoid $M$. The
$\mathcal{R}$-factorization of a word $\alpha$ is given by the positions where the $\mathcal{R}$-class changes when reading
$\alpha$ from left to right. More precisely, let $\alpha = a_1w_1 \cdots a_{r-1}w_{r-1}a_r\beta$ with $r \geq 0$, $a_i \in \Gamma$, $w_i \in \Gamma^*$
and $\beta \in \Gamma^\infty$ such that

$$h(a_1w_1 \cdots a_i) \mathrel{\mathcal{R}} h(a_1w_1 \cdots a_iw_i) >_{\mathcal{R}} h(a_1w_1 \cdots a_{i+1})$$

for all $1 \leq i < r$ and $h(a_1w_1 \cdots a_r) \mathrel{\mathcal{R}} h(a_1w_1 \cdots a_rw)$ for every finite prefix $w$ of $\beta$. Let $z_i$
be the position of $a_i$ in the above factorization. The $\mathcal{R}$-factorization of $\alpha$ is $(z_1, a_1, \ldots, z_r, a_r)$.
For every word $\alpha$, the above factorization is unique and its size $r$ is at most $|M|$. Note that
$z_1 = 1$ for every nonempty word $\alpha$, even if $h(a_1) = 1$.
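For finite words, this factorization can be read off in a single left-to-right pass. The sketch below assumes a multiplication table `mul` with neutral element `one` and a dictionary `h` from letters to monoid elements; these encodings and the function name are assumptions, and infinite words are not handled.

```python
def r_factorization(mul, one, h, word):
    """Positions (1-based) and letters at which the R-class of the prefix image changes."""
    n = len(mul)

    def right_ideal(x):
        return frozenset({x} | {mul[x][z] for z in range(n)})

    result = []
    current = one                                   # image of the prefix read so far
    for position, letter in enumerate(word, start=1):
        following = mul[current][h[letter]]
        # The first letter always starts the factorization; afterwards a letter is
        # recorded exactly when it leaves the current R-class.
        if position == 1 or right_ideal(following) != right_ideal(current):
            result.append((position, letter))
        current = following
    return result
```

For the two-element monoid with neutral element `0` and zero `1`, and `h = {"a": 1, "b": 0}`, the word `"bbab"` yields `[(1, "b"), (3, "a")]`: the first letter is always recorded, and the class drops when the first `a` is read.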

We extend this definition by taking the contexts of the $\mathcal{R}$-factorization into account. Let $k \in \mathbb{N}$ and consider the $\mathcal{R}$-factorization $(z_1, a_1, \ldots, z_r, a_r)$ of $\alpha$. Let $F_i = (z_i', w_i)$ with $z_i' = \max\{1, z_i - k\}$ and $w_i = \alpha(\{z_i - k, \ldots, z_i + k\})$, i.e., $w_i$ is the factor of $\alpha$ induced by all positions $z$ such that $|z - z_i| \le k$. The $\mathcal{R}(k)$-factorization of $\alpha$ is $F_1 \vee \cdots \vee F_r$. Let $F = (x_1, u_1, \ldots, x_\ell, u_\ell)$ be the $\mathcal{R}(k)$-factorization of $\alpha$ and let $X$ be the set of its positions. We have $|X| \le |M|(2k + 1) - k$ since at most $k + 1$ positions come from the first position of the $\mathcal{R}$-factorization and all other positions of the $\mathcal{R}$-factorization contribute at most $2k + 1$ positions to $X$. In particular, $|X| \le 2k^2$ if $k \ge |M|$. We have $\alpha = u_1 w_1 \cdots u_{\ell-1} w_{\ell-1} u_\ell \beta$ for some $w_i \in \Gamma^*$, $\beta \in \Gamma^\infty$ such that the $u_i$'s cover the positions of the $\mathcal{R}$-factorization and moreover, the $\mathcal{R}$-class changes at neither the $k$ first positions of any $u_i$ with $i > 1$ nor at the $k$ last positions of any $u_i$ with $i < \ell$.
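For finite words, both factorizations can be computed directly from the definition. The sketch below is an illustration under stated assumptions: the monoid is given by a list `elements` and a multiplication function `mul`, the homomorphism `h` is a dictionary on letters, and $\mathcal{R}$-equivalence is tested by comparing right ideals $sM$. The example monoid ($\{0,1,2,3\}$ under `max`) is chosen only because its $\mathcal{R}$-classes are easy to follow; it is not taken from the paper.

```python
def right_ideal(s, elements, mul):
    """The right ideal sM; s and t are R-equivalent iff their ideals agree."""
    return frozenset(mul(s, m) for m in elements)


def r_factorization(word, h, identity, elements, mul):
    """Pairs (position, letter) at which the R-class of the prefix image drops."""
    result, prefix = [], identity
    for pos, letter in enumerate(word, start=1):
        nxt = mul(prefix, h[letter])
        changed = right_ideal(nxt, elements, mul) != right_ideal(prefix, elements, mul)
        # Position 1 is always included, even if h(a_1) is the identity.
        if pos == 1 or changed:
            result.append((pos, letter))
        prefix = nxt
    return result


def r_k_factorization(word, k, h, identity, elements, mul):
    """Join of the radius-k contexts around the R-factorization positions."""
    merged = []
    for z, _ in r_factorization(word, h, identity, elements, mul):
        start, end = max(1, z - k), min(len(word), z + k)
        if merged and start <= merged[-1][1]:   # overlapping contexts merge
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, word[s - 1:e]) for s, e in merged]


# Illustrative monoid: {0, 1, 2, 3} under max, identity 0.
elements, identity, mul = [0, 1, 2, 3], 0, max
h = {"a": 1, "b": 2, "c": 3}
word = "aaaabaaaacaa"
print(r_factorization(word, h, identity, elements, mul))
# [(1, 'a'), (5, 'b'), (10, 'c')]
print(r_k_factorization(word, 1, h, identity, elements, mul))
# [(1, 'aa'), (4, 'aba'), (9, 'aca')]
```

In this example the $\mathcal{R}(1)$-factorization has $8$ positions, which is within the bound $|M|(2k+1) - k = 11$.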

The $\mathcal{L}$-factorization of a finite word $w \in \Gamma^*$ is the left-right dual of the $\mathcal{R}$-factorization: Let $w = w_1 a_1 \cdots w_r a_r$ with $r \ge 0$, $a_i \in \Gamma$, and $w_i \in \Gamma^*$ such that

$$h(a_{i-1} w_i a_i \cdots w_r a_r) <_{\mathcal{L}} h(w_i a_i \cdots w_r a_r) \mathrel{\mathcal{L}} h(a_i \cdots w_r a_r)$$

for all $1 < i \le r$. The $\mathcal{L}$-factorization of $w$ is then given by the factors $a_i$ of length one together with their positions in $w$.

As for $\mathcal{R}$-factorizations, we extend this definition by taking contexts into account. Let $(z_1, a_1, \ldots, z_r, a_r)$ be the $\mathcal{L}$-factorization of $w$. Let $k \in \mathbb{N}$ and let $G_i = (z_i', w_i)$ with $z_i' = \max\{1, z_i - k\}$ and let $w_i = w(\{z_i - k, \ldots, z_i + k\})$ be the factor of $w$ induced by all positions $z$ such that $|z - z_i| \le k$. Then, the $\mathcal{L}(k)$-factorization of $w$ is $G_1 \vee \cdots \vee G_r$. Let $G = (y_1, v_1, \ldots, y_m, v_m)$ be the $\mathcal{L}(k)$-factorization of $w$ and let $Y$ be the set of its positions. As for $\mathcal{R}(k)$-factorizations, we have $|Y| \le 2k^2$ if $k \ge |M|$.

Lemma 11. Let $h : \Gamma^* \to M$ with $h(\Gamma^+) \in \mathbf{B}_1$ and let $k \ge |M|$. If $u = w_0 u_1 w_1 \cdots u_\ell w_\ell$ and $v = w_0 v_1 w_1 \cdots v_\ell w_\ell$ for words $u_i, v_i, w_i \in \Gamma^*$ such that the $w_i$'s in $u$ cover the positions of the $\mathcal{R}(k)$-factorization of $u$ and the $w_i$'s in $v$ cover the positions of the $\mathcal{L}(k)$-factorization of $v$, then $h(u) = h(v)$.

Proof: The proof goes as follows. Since $k$ is large enough, we find a short prefix $p_i$ and a short suffix $q_i$ of each $w_i$ admitting idempotent stabilizers $f_i$ and $e_i$. Appending these prefixes and suffixes to the $u_i$'s and $v_i$'s then allows us to apply Lemma 3.

We can assume that each $w_i$ covers the positions of a factor of the $\mathcal{R}(k)$-factorization of $u$ or of a factor of the $\mathcal{L}(k)$-factorization of $v$. In particular, $|w_0|, |w_\ell| \ge k$ and $|w_i| \ge 2k$ for $0 < i < \ell$. By Lemma 2 and its left-right dual, there exist idempotents $f_1, \ldots, f_\ell, e_0, \ldots, e_{\ell-1} \in h(\Gamma^+)$ such that each $w_i$ admits a factorization $w_i = p_i r_i q_i$ with $|p_i| \le k$ and $|q_i| \le k$ satisfying

$$h(p_i) = h(p_i) f_i \quad \text{for } 0 < i \le \ell,$$
$$h(q_i) = e_i h(q_i) \quad \text{for } 0 \le i < \ell.$$

In particular, we can assume $p_0 = 1 = q_\ell$. Let $x_i = q_{i-1} u_i p_i$ and $s_i = q_{i-1} v_i p_i$ for $1 \le i \le \ell$. Then, $u = r_0 x_1 r_1 \cdots x_\ell r_\ell$ and $v = r_0 s_1 r_1 \cdots s_\ell r_\ell$, and the $r_i$'s in $u$ cover the positions of the $\mathcal{R}$-factorization of $u$, whereas the $r_i$'s in $v$ cover the positions of the $\mathcal{L}$-factorization of $v$. Thus

$$h(r_0 x_1 \cdots r_{i-1}) \mathrel{\mathcal{R}} h(r_0 x_1 \cdots r_{i-1}) \cdot e_{i-1} h(x_i) f_i,$$
$$e_{i-1} h(s_i) f_i \cdot h(r_i \cdots s_\ell r_\ell) \mathrel{\mathcal{L}} h(r_i \cdots s_\ell r_\ell).$$

An $\ell$-fold application of Lemma 3 yields

$$\begin{aligned}
h(u) &= h(r_0 x_1 \cdots r_{\ell-2} x_{\ell-1} r_{\ell-1} x_\ell r_\ell) \\
&= h(r_0 x_1 \cdots r_{\ell-2} x_{\ell-1} r_{\ell-1} s_\ell r_\ell) \\
&= h(r_0 x_1 \cdots r_{\ell-2} s_{\ell-1} r_{\ell-1} s_\ell r_\ell) \\
&\;\;\vdots \\
&= h(r_0 s_1 \cdots r_{\ell-2} s_{\ell-1} r_{\ell-1} s_\ell r_\ell) = h(v).
\end{aligned}$$

We can think of the above equations as converting $h(u)$ into $h(v)$ by using substitution rules $x_i \to s_i$. Note that the image under $h$ is preserved only when applying these rules from right to left. □

Next, we give a version of Lemma 11 for finite and infinite words. The problem is that there is no canonical choice for the $\mathcal{L}(k)$-factorization of an infinite word $\alpha$. We overcome this obstacle by fixing a type and considering $\mathcal{L}(k)$-factorizations of this type for infinitely many prefixes of $\alpha$.

Lemma 12. Let $h : \Gamma^* \to M$ with $h(\Gamma^+) \in \mathbf{B}_1$, let $k \ge |M|$ and let $\alpha = w_0 u_1 w_1 \cdots u_\ell w_\ell \gamma$ with $u_i, w_i \in \Gamma^*$ and $\gamma \in \Gamma^\infty$ such that the $w_i$'s cover the positions of the $\mathcal{R}(k)$-factorization of $\alpha$. Let $\tau$ be a type such that for every finite prefix $p$ of $\beta$, there exists $q \in \Gamma^*$ with $pq \le \beta$ and

• the $\mathcal{L}(k)$-factorization $G$ of $pq$ has type $\tau$, and

• $pq = w_0 v_1 w_1 \cdots v_\ell w_\ell$ for some $v_i \in \Gamma^*$ such that the $w_i$'s cover the positions of $G$.

Then $s \le_{\mathcal{R}} t$ for all linked pairs $(s, e)$ and $(t, f)$ of $M$ with $\alpha \in [s][e]^\omega$ and $\beta \in [t][f]^\omega$.

Proof: Suppose $\alpha \in [s][e]^\omega$ and $\beta \in [t][f]^\omega$. We can write $\beta \in p[f]^\omega$ with $h(p) = t$. By assumption, there exists $q \in \Gamma^*$ such that $pq$ is a prefix of $\beta$ with $\mathcal{L}(k)$-factorization $G$ of type $\tau$. Moreover, we have a factorization $pq = w_0 v_1 w_1 \cdots v_\ell w_\ell$ such that the positions of $G$ are covered by the $w_i$'s. Let $r = h(pq)$. We have $r \le_{\mathcal{R}} t$ because $p$ is a prefix of $pq$. By Lemma 11 we have $h(w_0 u_1 w_1 \cdots u_\ell w_\ell) = h(w_0 v_1 w_1 \cdots v_\ell w_\ell)$. Since we can write $\alpha \in w[e]^\omega$ such that $h(w) = s$ and $w_0 u_1 w_1 \cdots u_\ell w_\ell$ is a prefix of $w$, we conclude $s \le_{\mathcal{R}} r \le_{\mathcal{R}} t$. □

Let $G = (y_1, v_1, \ldots, y_m, v_m)$ be a factorization. A factorization $F = (x_1, u_1, \ldots, x_\ell, u_\ell)$ is a subfactorization of $G$, denoted by $F \preceq G$, if for every $i \in \{1, \ldots, \ell\}$ there exists $j \in \{1, \ldots, m\}$ such that $v_j = p u_i q$ and $x_i = y_j + |p|$ for some $p, q \in \Gamma^*$. Intuitively, this means that every $u_i$ is covered by some $v_j$. Let $G$ and $G'$ be factorizations of the same type. Then, there is a one-to-one correspondence between the positions of $G$ and the positions of $G'$. Hence, every subfactorization $F \preceq G$ induces a subfactorization $F' \preceq G'$.

For every factorization $F = (x_1, u_1, \ldots, x_\ell, u_\ell)$ with $x_1 = 1$ we define the monomial $P_F = u_1 \Gamma^* u_2 \cdots \Gamma^* u_\ell \Gamma^\infty$ of degree $|u_1 \cdots u_\ell|$. Now, whenever $F$ is a factorization of a word $\alpha$, then $\alpha \in P_F$. The converse does not hold, but if $\alpha \in P_F$, then there exists a factorization $F'$ of $\alpha$ with type $(u_1, \ldots, u_\ell)$. Next, we give a canonical way of turning a membership $\alpha \in P_F$ into such a factorization $F'$.
Let $P = u_1 \Gamma^* u_2 \cdots \Gamma^* u_\ell \Gamma^\infty$ be a monomial and suppose $\alpha \in P$. Write $\alpha = u_1 s_1 u_2 \cdots s_{\ell-1} u_\ell \beta$ such that $(|s_1|, \ldots, |s_{\ell-1}|)$ is minimal in the lexicographic order, i.e., we first minimize $|s_1|$, then $|s_2|$, and so on. We can think of this as greedily minimizing the lengths of the $s_i$'s one after another. Now, the greedy factorization for $\alpha \in P$ is $F' = (x_1, u_1, \ldots, x_\ell, u_\ell)$ with

$$x_i = 1 + |u_1 s_1 u_2 \cdots s_{i-1}|.$$
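For a finite word (or a sufficiently long finite prefix of an infinite one), the greedy factorization can be obtained by a leftmost search for each factor in turn. The following sketch assumes nonempty factors and uses the hypothetical name `greedy_factorization`; it returns `None` when the word is not in the monomial. Choosing the leftmost occurrence at each step is safe because an earlier end position never rules out a later match.

```python
def greedy_factorization(alpha, factors):
    """Greedy factorization of alpha for u_1 G* u_2 ... G* u_l G^infinity.

    `factors` is the list [u_1, ..., u_l] of nonempty words. Returns a list of
    (position, factor) pairs with 1-indexed positions, or None if alpha is not
    contained in the monomial.
    """
    if not factors or not alpha.startswith(factors[0]):
        return None
    result = [(1, factors[0])]
    end = len(factors[0])            # number of letters consumed so far
    for u in factors[1:]:
        idx = alpha.find(u, end)     # leftmost occurrence, i.e. minimal |s_i|
        if idx < 0:
            return None
        result.append((idx + 1, u))
        end = idx + len(u)
    return result


print(greedy_factorization("abxxcdxxcd", ["ab", "cd"]))
# [(1, 'ab'), (5, 'cd')]
```

In the usage example the second occurrence of `cd` is ignored, since the first one already minimizes $|s_1|$.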

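Proposition 13. Let $h : \Gamma^* \to M$ be a homomorphism with $h(\Gamma^+) \in \mathbf{B}_1$ and let $\alpha \in [s][e]^\omega$ and $\beta \in [t][f]^\omega$ for some linked pairs $(s, e)$ and $(t, f)$ of $M$. If $\alpha$ and $\beta$ are contained in the same monomials $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n \Gamma^\infty$ of degree at most $4|M|^2$, then $s \mathrel{\mathcal{R}} t$.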

Proof: Let $k = |M|$. We shall first give an intuitive outline of our proof. We consider the $\mathcal{R}(k)$-factorization $F$ of $\alpha$. This converts to a factorization $F'$ of $\beta$. Then we choose a prefix $pq$ of $\beta$ such that its $\mathcal{L}(k)$-factorization $G'$ is "as far to the right as possible" in a certain sense. Next, the factorization $G'$ of $\beta$ is converted into a factorization $G$ of $\alpha$. This process makes use of the factorization $F'$ to ensure that on $\alpha$ the factorization $G$ is sufficiently far to the right of $F$. Using Proposition m the crucial step is to show that $F \vee G$ and $F' \vee G'$ have the same type. This step was inspired by a proof of Klíma [19]. Finally, applying Lemma 12 we obtain $s \le_{\mathcal{R}} t$. Since the situation is symmetric in $\alpha$ and $\beta$, we conclude $s \mathrel{\mathcal{R}} t$.

Let $F = (x_1, u_1, \ldots, x_\ell, u_\ell)$ be the $\mathcal{R}(k)$-factorization of $\alpha$. Note that $\alpha \in P_F$ and the degree of $P_F$ is at most $2k^2$. Therefore, $\beta \in P_F$ by assumption. Let $F' = (x_1', u_1, \ldots, x_\ell', u_\ell)$ be the greedy factorization for $\beta \in P_F$.

There exists a type $\tau$ such that for every prefix $p$ of $\beta$ there is a prefix $pq$ of $\beta$ with an $\mathcal{L}(k)$-factorization of type $\tau$. If $\beta$ is an infinite word, then this means that there are infinitely many such prefixes $pq$ of $\beta$.

Consider some prefix $pq$ of $\beta$ with an $\mathcal{L}(k)$-factorization $G' = (y_1', v_1, \ldots, y_m', v_m)$ of type $\tau$ such that $y' > x'$ for as many positions $y'$ of $G'$ and positions $x'$ of $F'$ as possible. Let $H' = F' \vee G'$. We have $\beta \in P_{H'}$ and the degree of $P_{H'}$ is at most $4k^2$. Thus $\alpha \in P_{H'}$. Let $H$ be the greedy factorization for $\alpha \in P_{H'}$. Further, let $G \preceq H$ be the subfactorization of $H$ induced by $G' \preceq H'$. Note that we cannot directly transfer the factorization $G'$ of $\beta$ to the word $\alpha$ because we want that $G = (y_1, v_1, \ldots, y_m, v_m)$ is "sufficiently far to the right". Next, we show $H = F \vee G$.

We claim that for all $i \in \{1, \ldots, \ell\}$, for all $0 \le j < |u_i|$, and for all $r \in \{1, \ldots, m\}$ we have

$$x_i + j < y_r \;\text{ iff }\; x_i' + j < y_r' \quad\text{and}$$
$$x_i + j \le y_r \;\text{ iff }\; x_i' + j \le y_r'.$$

Using property '1' of Proposition U we see that $F$ is the greedy factorization for $\alpha \in P_F$. Therefore, $x_i' + j < y_r'$ in $\beta$ implies $x_i + j < y_r$ in $\alpha$. Similarly, $x_i' + j \le y_r'$ in $\beta$ implies $x_i + j \le y_r$ in $\alpha$. Suppose $x_i + j < y_r$ in $\alpha$. Let

$$J = (x_1, u_1, \ldots, x_i, u_i) \vee (y_r, v_r, \ldots, y_m, v_m).$$

We have $\alpha \in P_J$ and the degree of $P_J$ is at most $4k^2$. Hence, $\beta \in P_J$ and therefore $x_i' + j < y_r'$ by property '2' of Proposition H] and by choice of $pq$. Suppose $x_i + j \le y_r$ in $\alpha$. If $x_i + j < y_r$, then we are done by the previous consideration. So suppose $x_i + j = y_r$. We have $\alpha \in P_J$ with $J$ defined as above. Now, $\beta \in P_J$ implies $x_i' + j \le y_r'$. Note that we cannot conclude $x_i' + j = y_r'$ at this point. This proves the claim.

The above claim shows that indeed $H = F \vee G$. Let $p'q'$ be such that $pq \le p'q' \le \beta$ and $p'q'$ has an $\mathcal{L}(k)$-factorization of type $\tau$. Then, by property '2' of Proposition [H the factors of the $\mathcal{L}(k)$-factorization of $p'q'$ can only lie further to the right than those of the $\mathcal{L}(k)$-factorization of $pq$. Thus considering the $\mathcal{L}(k)$-factorization of $p'q'$ instead of $pq$ leads to the same factorization $H$ of $\alpha$. Hence, Lemma 12 shows $s \le_{\mathcal{R}} t$. The situation is symmetric in $\alpha$ and $\beta$. Therefore, $s \mathrel{\mathcal{R}} t$. □

We are now ready to prove Theorem 5.
Proof (Proof of Theorem 5): '1 ⇒ 2': By Lemma 8 every monomial $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n \Gamma^\infty$ is definable in $\Sigma_1[<, +1, \min]$. Hence, the Boolean closure of such languages is contained in the Boolean closure of $\Sigma_1[<, +1, \min]$.

'2 ⇒ 3': The condition $\mathrm{Synt}(L) \in \mathbf{B}_1$ is shown in Lemma 9. By Lemma 10, for every linked pair $(s, e)$ of $\mathrm{Synt}(L)$ we have $[s] \subseteq L$ if and only if $[s][e]^\omega \subseteq L$. This is equivalent to the condition for linked pairs in '3b', see [10, Proposition 6.4]. The implication '3 ⇒ 4' is trivial since $L$ is recognized by its syntactic homomorphism.

'4 ⇒ 1': We write $\alpha \equiv \beta$ if $\alpha$ and $\beta$ are contained in the same monomials $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n \Gamma^\infty$ of degree at most $4|M|^2$. Every $\equiv$-class is a finite Boolean combination of such monomials. It therefore suffices to show that $\beta \equiv \alpha \in L$ implies $\beta \in L$. Suppose $\alpha \in [s][e]^\omega \subseteq L$ and $\beta \in [t][f]^\omega$ for some linked pairs $(s, e)$ and $(t, f)$. By Proposition 13 we see that $\alpha \equiv \beta$ implies $s \mathrel{\mathcal{R}} t$. Thus $[t][f]^\omega \subseteq L$ and in particular $\beta \in L$. □



5 The Fragment $\mathbb{B}\Sigma_1[<, +1, \min, \max]$ over $\Gamma^*$

In this section we give a new self-contained proof of Knast's result for dot-depth one [20]. Another proof was given by Thérien [37]. Both Knast's and Thérien's proofs rely on so-called finite categories. Our proof uses only elementary algebraic concepts like Green's relations. The main part of the proof builds on Proposition 13. Note that a language $L \subseteq \Gamma^*$ is definable in $\mathbb{B}\Sigma_1[<, +1, \min, \max]$ over $\Gamma^\infty$ if and only if $L$ is definable in this fragment over $\Gamma^*$.

Theorem 14. Let $L \subseteq \Gamma^*$. The following are equivalent:

1. $L$ has dot-depth one, i.e., $L$ is a finite Boolean combination of monomials $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$.

2. $L$ is definable in $\mathbb{B}\Sigma_1[<, +1, \min, \max]$.

3. $\mathrm{Synt}(L) \in \mathbf{B}_1$.

4. $L$ is recognized by some homomorphism $h : \Gamma^* \to M$ with $h(\Gamma^+) \in \mathbf{B}_1$.

Proof: '1 ⇒ 2': By Lemma 8 every language of the form $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$ is definable in $\Sigma_1[<, +1, \min, \max]$. Hence, the Boolean closure of such languages is contained in the Boolean closure of $\Sigma_1[<, +1, \min, \max]$. '2 ⇒ 3': This is Lemma 9. The implication '3 ⇒ 4' is trivial.

'4 ⇒ 1': We write $u \equiv v$ if $u, v \in \Gamma^*$ are contained in the same monomials $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$ of degree at most $4|M|^2$. Every $\equiv$-class is a Boolean combination of such monomials. Thus it suffices to show $h(u) = h(v)$ whenever $u \equiv v$. Applying Proposition 13 with $e = f = 1$ shows $h(u) \mathrel{\mathcal{R}} h(v)$ if $u \equiv v$. The reversal of a word $w = a_1 \cdots a_n$ with $a_i \in \Gamma$ is $a_n \cdots a_1$. Let $u'$ and $v'$ be the reversals of $u$ and $v$, respectively. Now, $u \equiv v$ implies $u' \equiv v'$. By Proposition 13 we have $h(u') \mathrel{\mathcal{R}} h(v')$ in the reversal of $M$. This in turn is equivalent to $h(u) \mathrel{\mathcal{L}} h(v)$ in $M$. Thus $h(u) = h(v)$ since $M$ is aperiodic [23, Proposition A.2.9]. Therefore, for every $x \in M$ the language $h^{-1}(x)$ is a Boolean combination of monomials. □
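The equivalence $u \equiv v$ used in the step '4 ⇒ 1' can be tested by brute force for small degrees. The sketch below is only an illustration: it assumes that all factors $w_i$ are nonempty, enumerates every monomial $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$ up to a given degree over a small alphabet, and compares the sets of monomials containing each word. The names `monomials`, `in_monomial`, `signature` and `equivalent` are hypothetical, and the enumeration grows exponentially in the degree, so it is not meant for the bound $4|M|^2$ in practice.

```python
import re
from itertools import product


def monomials(alphabet, degree):
    """All tuples (w_1, ..., w_n) of nonempty words with |w_1 ... w_n| <= degree."""
    result = []

    def extend(prefix, remaining):
        if prefix:
            result.append(tuple(prefix))
        for n in range(1, remaining + 1):
            for letters in product(alphabet, repeat=n):
                extend(prefix + ["".join(letters)], remaining - n)

    extend([], degree)
    return result


def in_monomial(word, factors):
    """Membership of a finite word in w_1 G* w_2 ... G* w_n."""
    pattern = ".*".join(re.escape(w) for w in factors)
    return re.fullmatch(pattern, word, flags=re.DOTALL) is not None


def signature(word, alphabet, degree):
    """The set of monomials of bounded degree that contain the word."""
    return frozenset(m for m in monomials(alphabet, degree) if in_monomial(word, m))


def equivalent(u, v, alphabet, degree):
    return signature(u, alphabet, degree) == signature(v, alphabet, degree)


print(equivalent("aab", "aaab", "ab", 2))  # True: only a G* b contains them
print(equivalent("aab", "aaab", "ab", 3))  # False: the monomial aab separates them
```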
6 The Fragment $\mathbb{B}\Sigma_1[<, +1, \min, \max]$ over $\Gamma^\infty$

In this section, we incorporate the max-predicate. This leads to an effective characterization of the first-order fragment $\mathbb{B}\Sigma_1[<, +1, \min, \max]$ over finite and infinite words. The major difference between Theorem 15 below and Theorem 5 is that the "topological" linked pair condition is slightly different. To express this new condition, we have to use the pure syntactic homomorphism which can distinguish between finite and infinite words.

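Theorem 15. Let $L \subseteq \Gamma^\infty$ be regular. The following assertions are equivalent:

1. $L$ is a finite Boolean combination of monomials $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n \Gamma^\infty$ and $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$.

2. $L$ is definable in $\mathbb{B}\Sigma_1[<, +1, \min, \max]$.

3. The pure syntactic homomorphism $h_+ : \Gamma^* \to \mathrm{Synt}_+(L)$ satisfies

a) $\mathrm{Synt}(L) \in \mathbf{B}_1$, and

b) for all linked pairs $(s, e)$ and $(t, f)$ of $\mathrm{Synt}_+(L)$ with $s \mathrel{\mathcal{R}} t$ and $e \ne 1 \ne f$ we have $[s][e]^\omega \subseteq L \Leftrightarrow [t][f]^\omega \subseteq L$.

4. $L$ is recognized by a homomorphism $h : \Gamma^* \to M$ with $h(u) = 1$ only if $u = 1$ satisfying

a) $h(\Gamma^+) \in \mathbf{B}_1$, and

b) for all linked pairs $(s, e)$ and $(t, f)$ of $M$ with $s \mathrel{\mathcal{R}} t$ and $e \ne 1 \ne f$ we have $[s][e]^\omega \subseteq L \Leftrightarrow [t][f]^\omega \subseteq L$.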

Before proving Theorem 15 at the end of this section, we give a counterpart of Lemma 10 for infinite words.

Lemma 16. Let $L \subseteq \Gamma^\infty$ be definable in $\mathbb{B}\Sigma_1[<, +1, \min, \max]$. If $h : \Gamma^* \to M$ is a surjective homomorphism recognizing $L$ such that $h(u) = 1$ only if $u = 1$, then $[s][e]^\omega \subseteq L \Leftrightarrow [t][f]^\omega \subseteq L$ for all linked pairs $(s, e)$ and $(t, f)$ of $M$ with $s \mathrel{\mathcal{R}} t$ and $e \ne 1 \ne f$.

Proof: Let $\varphi \in \Sigma_1[<, +1, \min, \max]$ be a sentence. If $\alpha \in \Gamma^\omega$ models $\varphi$, then there is a finite prefix $u$ of $\alpha$ such that for every $\beta \in \Gamma^\omega$ we have $u\beta \models \varphi$. This is because $\alpha \models \varphi$ yields some satisfying assignment for the variables, and positions beyond the last position of this assignment have no influence.

Let $L$ be defined by a formula with quantifier depth $d$, let $t = sx$ and $s = ty$ for $x, y \in M$. Consider $\alpha_1 = \hat{s} \hat{e}^\omega$ for $\hat{s} \in [s]$ and $\hat{e} \in [e]$, and let $\hat{x} \in [x]$, $\hat{y} \in [y]$, and $\hat{f} \in [f]$. By the above consideration, there exists a finite prefix $u = \hat{s} \hat{e}^n$ of $\alpha_1$ such that $\beta_1 = u \hat{x} \hat{f}^\omega$ models at least the same formulas in $\Sigma_1[<, +1, \min, \max]$ with quantifier depth at most $d$ as $\alpha_1$ does. Similarly, there exists a prefix $v = u \hat{x} \hat{f}^m$ of $\beta_1$ such that $\alpha_2 = v \hat{y} \hat{e}^\omega$ models at least the same formulas in $\Sigma_1[<, +1, \min, \max]$ with quantifier depth at most $d$ as $\beta_1$ does. We continue this process and construct $\alpha_1, \beta_1, \alpha_2, \beta_2, \ldots$ such that each word satisfies at least the same formulas with quantifier depth $d$ as its predecessor. There are only finitely many nonequivalent $\Sigma_1[<, +1, \min, \max]$-formulas with quantifier depth at most $d$. Hence, there exist words $\alpha_i \in [s]\hat{e}^\omega$ and $\beta_j \in [t]\hat{f}^\omega$ which satisfy the same formulas in $\Sigma_1[<, +1, \min, \max]$ with quantifier depth at most $d$. Now, $\alpha_i \in L$ if and only if $\beta_j \in L$. This yields $[s][e]^\omega \subseteq L$ if and only if $[t][f]^\omega \subseteq L$. □

Combining Theorem 5, Theorem 14 and Lemma 16 yields the following proof of Theorem 15.
Proof (Proof of Theorem 15): '1 ⇒ 2': By Lemma 8 every monomial $w_0 \Gamma^* w_1 \cdots \Gamma^* w_n \Gamma^\infty$ or $w_0 \Gamma^* w_1 \cdots \Gamma^* w_n$ is definable in $\Sigma_1[<, +1, \min, \max]$. Therefore, the Boolean closure of such languages is contained in $\mathbb{B}\Sigma_1[<, +1, \min, \max]$.

'2 ⇒ 3': Let $L$ be defined by $\varphi \in \mathbb{B}\Sigma_1[<, +1, \min, \max]$. Lemma 9 shows $\mathrm{Synt}(L) \in \mathbf{B}_1$. The condition '3b' for the linked pairs follows from Lemma 16.

'3 ⇒ 4': This is trivial, since $h_+ : \Gamma^* \to \mathrm{Synt}_+(L)$ recognizes $L$ and $h_+$ maps only the empty word to $1 \in \mathrm{Synt}_+(L)$.

'4 ⇒ 1': Consider $L_\omega = L \cap \Gamma^\omega$ and let $L_\infty$ be the union of the language $L_\omega$ and the following language over finite words:

$$\bigcup \{[s] \mid L_\omega \cap [s][e]^\omega \ne \emptyset \text{ for some linked pair } (s, e) \text{ of } M\}.$$

Then $L_\infty$ satisfies condition '4' in Theorem 5 for the homomorphism $h$ and hence $L_\infty$ is a finite Boolean combination of monomials $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n \Gamma^\infty$. Since $\Gamma^\omega$ is a finite Boolean combination of languages $\Gamma^* a$ and $a \Gamma^\infty$ for $a \in \Gamma$, we see that $L_\omega = L_\infty \cap \Gamma^\omega$ is a finite Boolean combination of monomials $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n \Gamma^\infty$ and $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n$. Consider $L_{\mathrm{fin}} = L \cap \Gamma^*$. Lemma 1 shows that $L_{\mathrm{fin}}$ is recognized by $h$. Therefore, $L_{\mathrm{fin}}$ is a finite Boolean combination of monomials $w_0 \Gamma^* w_1 \cdots \Gamma^* w_n$ by Theorem 14. Thus $L = L_{\mathrm{fin}} \cup L_\omega$ is of the required form. □

7 The Fragment $\mathbb{B}\Sigma_1[<, +1, \min]$ over $\Gamma^\omega$

If we consider infinite words only, the predicate max is always false. Hence, the first-order fragments $\mathbb{B}\Sigma_1[<, +1, \min, \max]$ and $\mathbb{B}\Sigma_1[<, +1, \min]$ coincide. In this section we give an effective characterization of this fragment for infinite words. It is a rather straightforward consequence of Theorem 15.

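Theorem 17. Let $L \subseteq \Gamma^\omega$ be regular. The following assertions are equivalent:

1. $L$ has dot-depth one, i.e., $L$ is a finite Boolean combination of monomials of the form $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n \Gamma^\omega$.

2. $L$ is definable in $\mathbb{B}\Sigma_1[<, +1, \min]$ over $\Gamma^\omega$.

3. The pure syntactic homomorphism $h_+ : \Gamma^* \to \mathrm{Synt}_+(L)$ satisfies

a) $\mathrm{Synt}(L) \in \mathbf{B}_1$, and

b) for all linked pairs $(s, e)$ and $(t, f)$ of $\mathrm{Synt}_+(L)$ with $s \mathrel{\mathcal{R}} t$ and $e \ne 1 \ne f$ we have $[s][e]^\omega \subseteq L \Leftrightarrow [t][f]^\omega \subseteq L$.

4. $L$ is recognized by a homomorphism $h : \Gamma^* \to M$ with $h(u) = 1$ only if $u = 1$ satisfying

a) $h(\Gamma^+) \in \mathbf{B}_1$, and

b) for all linked pairs $(s, e)$ and $(t, f)$ of $M$ with $s \mathrel{\mathcal{R}} t$ and $e \ne 1 \ne f$ we have $[s][e]^\omega \subseteq L \Leftrightarrow [t][f]^\omega \subseteq L$.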

Proof: '1 ⇒ 2': If $L$ is a Boolean combination of monomials $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n \Gamma^\omega$, then $L$ can also be written as a Boolean combination of monomials $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n \Gamma^\infty$ and $\Gamma^* a$ for $a \in \Gamma$. By Theorem 15 the language $L$ is definable in $\mathbb{B}\Sigma_1[<, +1, \min, \max]$ over $\Gamma^\omega$. Since max is false for all positions of an infinite word, $L$ is definable in $\mathbb{B}\Sigma_1[<, +1, \min]$ over $\Gamma^\omega$.

'2 ⇒ 3': Let $L$ be definable in the fragment $\mathbb{B}\Sigma_1[<, +1, \min]$ over $\Gamma^\omega$. Then $L$ is definable in $\mathbb{B}\Sigma_1[<, +1, \min, \max]$ over $\Gamma^\infty$ and by Theorem 15 the claim follows. '3 ⇒ 4': Trivial.

'4 ⇒ 1': Let $L_\infty$ be the union of $L$ and the following language over finite words:

$$\bigcup \{[s] \mid L \cap [s][e]^\omega \ne \emptyset \text{ for some linked pair } (s, e) \text{ of } M\}.$$

Now, $L_\infty$ satisfies condition '4' in Theorem 5 for the homomorphism $h$ and we obtain that $L_\infty$ is a Boolean combination of monomials $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n \Gamma^\infty$. Moreover, $L = L_\infty \cap \Gamma^\omega$ and $L$ is a Boolean combination of monomials $w_1 \Gamma^* w_2 \cdots \Gamma^* w_n \Gamma^\omega$. □
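Fragment                                        Models              Languages                                                                                         Algebra + Linked Pairs                          Reference

$\mathbb{B}\Sigma_1[<, +1, \min]$               $\Gamma^\infty$     $\mathbb{B}\{w_1 \Gamma^* \cdots w_n \Gamma^\infty\}$                                             $\mathbf{B}_1$ + $\mathcal{R}$-closed           Thm. 5
$\mathbb{B}\Sigma_1[<, +1, \min, \max]$         $\Gamma^\infty$     $\mathbb{B}\{w_1 \Gamma^* \cdots \Gamma^* w_n \Gamma^\infty,\ w_1 \Gamma^* \cdots \Gamma^* w_n\}$ $\mathbf{B}_1$ + $\mathcal{R}^+$-closed         Thm. 15
$\mathbb{B}\Sigma_1[<, +1, \min, \max]$         $\Gamma^*$          $\mathbb{B}\{w_1 \Gamma^* \cdots \Gamma^* w_n\}$                                                  $\mathbf{B}_1$                                  [20], Thm. 14
$\mathbb{B}\Sigma_1[<, +1, \min]$               $\Gamma^\omega$     $\mathbb{B}\{w_1 \Gamma^* \cdots w_n \Gamma^\omega\}$                                             $\mathbf{B}_1$ + $\mathcal{R}^+$-closed         Thm. 17

Table 2: Characterizations of the fragment $\mathbb{B}\Sigma_1$ for various signatures and models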



Since condition '3' in Theorem 17 is decidable, we obtain the following corollary.

Corollary 18. It is decidable whether a regular language $L \subseteq \Gamma^\omega$ has dot-depth one. □
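The linked pair part of the decidable condition is a finite check once the monoid is known. The sketch below covers only condition b) and not the membership test for $\mathbf{B}_1$. It makes several assumptions that are not spelled out in this section: a linked pair is taken in the standard sense as a pair $(s, e)$ with $e^2 = e$ and $se = s$, the monoid is given by `elements` and `mul`, and `accepted` is a hypothetical input listing the linked pairs $(s, e)$ with $[s][e]^\omega \subseteq L$. The three-element example monoid is illustrative only.

```python
def linked_pairs(elements, mul):
    """All pairs (s, e) with e idempotent and s e = s (standard definition)."""
    return [(s, e) for s in elements for e in elements
            if mul(e, e) == e and mul(s, e) == s]


def r_equivalent(s, t, elements, mul):
    """s R t iff the right ideals sM and tM coincide."""
    return {mul(s, m) for m in elements} == {mul(t, m) for m in elements}


def is_r_plus_closed(elements, mul, identity, accepted):
    """Condition b): R-equivalent linked pairs with e != 1 != f agree on `accepted`."""
    pairs = [(s, e) for (s, e) in linked_pairs(elements, mul) if e != identity]
    return all(
        ((s, e) in accepted) == ((t, f) in accepted)
        for (s, e) in pairs
        for (t, f) in pairs
        if r_equivalent(s, t, elements, mul)
    )


# Illustrative monoid {1, a, b} with x * y = y for y != 1 ("remember the last letter").
elements, identity = ["1", "a", "b"], "1"


def mul(x, y):
    return x if y == "1" else y


print(is_r_plus_closed(elements, mul, identity, {("a", "a")}))              # False
print(is_r_plus_closed(elements, mul, identity, {("a", "a"), ("b", "b")}))  # True
```

In the example, $a \mathrel{\mathcal{R}} b$ holds, so accepting the linked pair $(a, a)$ without $(b, b)$ violates the condition.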

Remark 19. Another algebraic framework for infinite words are $\omega$-semigroups [23]. An $\omega$-semigroup $(S_+, S_\omega)$ has two components. The first component $S_+$ is a semigroup equipped with an infinite product operation and $S_\omega$ is the set of results of infinite products. The conditions '3' in Theorem 15 and '3' in Theorem 17 are equivalent to saying that the syntactic $\omega$-semigroup $(S_+, S_\omega)$ satisfies $S_+ \in \mathbf{B}_1$ and $(x^\pi y^\pi)^\pi x^\omega = (x^\pi y^\pi)^\pi y^\omega$ in $S_\omega$ for all $x, y \in S_+$, cf. [23, Theorem VI.3.8 (6)]. Here, $x^\pi \in S_+$ denotes the idempotent generated by $x$ and $x^\omega$ is an infinite product. The two components of an $\omega$-semigroup inevitably distinguish between finite nonempty and infinite words. Therefore, $\omega$-semigroups are only suitable for fragments which can distinguish finite from infinite words. In particular, $\mathbb{B}\Sigma_1[<, +1, \min]$ cannot distinguish between finite and infinite words and condition '3' in Theorem 5 is not an equational $\omega$-semigroup condition.

8 Summary

In Table 2 we summarize our results on alternation-free first-order logic $\mathbb{B}\Sigma_1$. We gave classes of languages for which $\mathbb{B}\Sigma_1[<, +1, \min]$ and $\mathbb{B}\Sigma_1[<, +1, \min, \max]$ are expressively complete. Our main results are characterizations of the syntactic homomorphisms of such languages. These characterizations are combinations of algebraic and topological properties. The topological properties are stated in terms of linked pairs.

An entry "$\mathcal{R}$-closed" in the column "Linked Pairs" of Table 2 stands for the equivalence $[s][e]^\omega \subseteq L \Leftrightarrow [t][f]^\omega \subseteq L$ for all linked pairs $(s, e)$ and $(t, f)$ with $s \mathrel{\mathcal{R}} t$ in the syntactic monoid. For "$\mathcal{R}^+$-closed" this equivalence has to hold for the pure syntactic homomorphism and $e \ne 1 \ne f$.

Over $\Gamma^\infty$ there are two variants of the Cantor topology. The first one is defined by the base sets $u\Gamma^\infty$ for $u \in \Gamma^*$, and base sets for the second one are $u\Gamma^\omega$ and $\{u\}$. A regular language is a finite Boolean combination of Cantor sets of the first kind if and only if its syntactic homomorphism is "$\mathcal{R}$-closed". Boolean combinations of Cantor sets of the second kind correspond to "$\mathcal{R}^+$-closed".

In all cases, the combination of the algebraic and the topological properties gives decidability of the membership problem for the respective fragment.